Multiplex SNaPshot Minisequencing for the Detection of Common PAH Gene Mutations in Iranian Patients with Phenylketonuria

Background: Phenylketonuria (PKU) is one of the most common inborn defects of amino acid metabolism worldwide. It is caused by an autosomal recessive deficiency of the hepatic enzyme phenylalanine hydroxylase (PAH), which catalyzes the irreversible hydroxylation of phenylalanine to tyrosine. More than 1,040 different disease-causing mutations have already been identified in the PAH gene. The most prominent complication of PKU, if not diagnosed and treated, is severe mental retardation. Hence, early diagnosis and initiation of nutritional therapy are the most significant measures in preventing this mental disorder. Given these data, we developed a simple and rapid molecular test to detect the most frequent PAH mutations. Methods: A multiplex assay was developed based on the SNaPshot minisequencing approach to simultaneously genotype 10 mutations in the PAH gene. We optimized detection of these mutations in one multiplex PCR, followed by 10 single-nucleotide extension reactions. DNA sequencing was also used to verify the genotyping results obtained by SNaPshot minisequencing. Results: All 10 genotypes were determined based on the position and the fluorescent color of the peaks in a single electropherogram. Sequencing results for these frequent mutations showed that a 100% detection rate could be achieved with this method in the Iranian population. Conclusion: SNaPshot minisequencing can be useful as a secondary test for hyperphenylalaninemia (HPA) in neonates with a positive screening test, and it is also suitable for carrier screening. The assay can be easily applied for accurate, time-efficient, and cost-efficient genotyping of the selected SNPs in various populations.


INTRODUCTION
Phenylketonuria (PKU), the most common inborn defect of amino acid metabolism, is caused by a deficiency of phenylalanine hydroxylase (PAH) and was first described by Asbjorn Folling in 1934 [1]. The prevalence of this metabolic disease has been reported as 1 in 10,000 among whites and 1 in 3,627 live births in the Iranian population [2]. PKU is the first known metabolic cause of mental retardation and the first genetic disorder of the central nervous system that can be fully treated by modifying external factors such as diet. PKU is also the first disorder successfully diagnosed by universal neonatal screening. Understanding the biochemical and molecular basis of PKU is of great importance for treatment strategies, which have significantly reduced morbidity and improved quality of life [3,4]. Untreated PKU is associated with a phenotype that includes intellectual disability, microcephaly, seizures, growth failure, and poor skin pigmentation. However, with early dietary intervention and the advent of newborn screening programs, PKU-positive infants can today be expected to have largely normal lives [5,6]. Mutations in the PAH gene, located on chromosome 12q23.2, account for the majority of PKU and hyperphenylalaninemia (HPA) types [7]. To date, over 1,040 disease-causing mutations have been reported [8,9]. Most of these are point mutations that result in missense changes [7,10].
Profound cognitive impairment caused by PKU can be prevented by detecting the PAH deficiency in the newborn period (the first week of life) and initiating a specialized diet [11]. Genotyping, in most cases, helps predict the phenotypic outcome as early as possible after birth. In addition, molecular confirmation is an important part of the diagnostic algorithm of HPA [8,12]. Sanger DNA sequencing, polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP), and the amplification-refractory mutation system are commonly used in research laboratories to identify SNPs [13]. However, these assays are expensive and time-consuming when several SNPs need to be examined, so there is a need for a reliable, sensitive, and low-cost assay that detects numerous mutations in a single experiment [14]. The multiplex SNaPshot minisequencing assay uses a single-tube reaction to investigate SNPs at specified locations. Its multiplex capability enables the analysis of more than 10 SNPs in a single reaction, regardless of their chromosomal locations or their distance from nearby SNP sites. The chemistry is based on dideoxy single-base extension of unlabeled oligonucleotide primers. A primer is hybridized to the DNA immediately adjacent to the variant nucleotide site and extended by DNA polymerase with the single fluorescent ddNTP that is complementary to the nucleotide at the target site. Capillary electrophoresis is then used to separate and fluorescently detect the extended products [15]. The technique has demonstrated 100% sensitivity and 100% specificity, enabling researchers to use it as a quick confirmation test to achieve early genotyping following a positive neonatal screening result. The test successfully identified both alleles in PKU patients and was quick and affordable.
Considering these benefits, we developed a SNaPshot minisequencing method to detect ten common mutations: IVS2+5 G>C, IVS2-13 T>G, c.473G>A (p.R158Q), c.526C>T (p.R176*), c.691T>C (p.S231P), c.782G>A (p.R261Q), IVS9+5 G>A, IVS10-11 G>A, c.1068C>A (p.Y356*), and c.1208C>T (p.A403V). These point mutations are the most frequently reported mutant alleles in studies carried out in different regions of Iran [16-21]. IVS2+5, IVS2-13, IVS9+5, and IVS10-11 are located in intronic regions; the remaining six are exonic. Pathogenicity was one of the selection criteria: all the selected SNPs are pathogenic variants that cause HPA in patients.

MATERIALS AND METHODS
Selection of SNPs
There are no accurate statistics on the incidence of PKU in Iran. However, based on the reports, its prevalence in the country is about 0.027% (1 in 3,627 live births) [22,23]. The distribution of different types of PAH mutations varies among the regions of Iran. Based on the studies carried out in various regions of Iran [16-21], 10 common mutations with high frequencies were selected for our study (Table 1).

Selection of samples
Samples were selected from the patients referred to two PKU reference laboratories in Tehran and Ahwaz cities of Iran. The case files for 70 families of patients with suspected PKU phenotype were investigated to select 30 individuals with 10 common mutations. Three different individuals were chosen for each mutation as follows: one mutant homozygote, one normal homozygote, and one heterozygote. The salting out procedure was used to extract the genomic DNA from peripheral blood, which was then maintained at -20 °C for long-term storage [24] . A Nanodrop ND-1000 spectrophotometer was employed to evaluate the purity and concentration of DNA (NanoDrop Technologies, USA).

Minisequencing primer design
Ten minisequencing primers were designed with Gene Runner software to identify each SNP from the eight amplified DNA fragments. Multiplex minisequencing primers were designed according to the Ensembl reference sequence. Each primer was designed to anneal with its 3' end one base before the SNP to be analyzed, and its melting temperature was calculated using the Biomaths Calculator program (https://worldwide.promega.com/resources/tools/biomath/tm-calculator/). The polymorphism is therefore identified by the base incorporated immediately after the minisequencing primer for each SNP. The minisequencing products generated for the SNPs differ significantly in size and can easily be distinguished in a single capillary electrophoresis run [25]. Minisequencing primers were between 15 and 25 nucleotides in length, with melting temperatures ranging from 49 to 53 °C. The sequences of the primers were checked for possible hairpin structures or primer-dimer formation as described above and were further tested for self-extension in control PCRs without a template. To allow adequate separation of the primer-extension products in a single capillary electrophoresis run, a nonhomologous neutral sequence (part or all of a 40-nucleotide-long sequence [5'-aactgactaaactaggtgccacgtcgtgaaagtctgacaa-3']) was incorporated at the 5' ends of the minisequencing primers to adjust their lengths and obtain a more balanced base composition [26]. The neutral sequence is a random sequence that does not match any human sequence in the NCBI database [27]. The length of the primers was modified by adding a neutral sequence for all SNPs, except for Set-1. Their final lengths ranged from 17 to 80 nucleotides, and each primer product was spaced at least four nucleotides from the nearest primer product (Table 3).
[Table residue: 0.13 61 20 R: 5'-CCACAGCCTCAGGTGTTTGA-3'. F, forward primer; R, reverse primer; nt, nucleotide]
To improve the electropherograms and prevent overlap between nearby primers, which leads to nonspecific extension, some primers were designed on the reverse strand (marked with R in Table 3).
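The length adjustment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the core primer below is hypothetical (the real sequences are in Table 3), and the choice to repeat the 40-nt neutral sequence and to take the portion nearest its 3' end when padding is our assumption, since the text says only that "a part or whole" of it was used.

```python
# 40-nt neutral sequence quoted in the text
NEUTRAL = "aactgactaaactaggtgccacgtcgtgaaagtctgacaa"


def add_tail(core: str, target_len: int, neutral: str = NEUTRAL) -> str:
    """Pad a primer at its 5' end with neutral sequence up to target_len.

    Assumption: when more than len(neutral) bases are needed, the neutral
    sequence is repeated, and the bases nearest its 3' end are kept.
    """
    pad = target_len - len(core)
    if pad < 0:
        raise ValueError("core primer is longer than the target length")
    if pad == 0:
        return core.upper()
    repeats = -(-pad // len(neutral))  # ceiling division
    tail = (neutral * repeats)[-pad:]
    # Lowercase tail, uppercase gene-specific part, for readability
    return tail + core.upper()


def min_spacing(lengths):
    """Smallest length difference between neighboring primers."""
    ordered = sorted(lengths)
    return min(b - a for a, b in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    core = "ACGTACGTACGTACGTA"  # hypothetical 17-nt core primer
    for target in (17, 21, 29):
        print(target, add_tail(core, target))
    print("minimum spacing:", min_spacing([17, 21, 29, 38]))
```

A check such as `min_spacing(...) >= 4` mirrors the four-nucleotide minimum separation stated in the text; the lengths passed to it here are illustrative only.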

Multiplex PCR amplification
Genotypic detection of PAH SNPs was performed using multiplex PCR (mPCR), followed by a multiplex SNaPshot minisequencing reaction. All amplicons were first tested in singleplex PCRs (not shown), and the eight primer pairs were then combined in one tube for convenience and low cost. A mix of the eight primer pairs was prepared, and 3.5 µL of this primer mix (final concentrations ranging from 0.087 to 0.13 μM; Table 2) was added to 12.5 µL of PCR Master Mix. Next, 100 ng of genomic DNA was added, and the reaction was brought to a final volume of 25 μL with DNase/RNase-free distilled water. PCRs were performed using TEMPase Hot Start Master Mix (Ampliqon, Denmark). Amplification was carried out in a thermocycler (Thermo Fisher Scientific, USA). After a pre-incubation step at 95 °C for 10 min, PCR was performed for a total of 30 cycles under the following conditions: denaturation at 95 °C for 30 s, annealing at 67 °C for 60 s, and extension at 72 °C for 60 s, followed by a 10-min final extension at 72 °C.
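As a small worked example of the reaction setup, the sketch below computes how much water is needed to reach 25 μL for a given DNA stock concentration and adds up the programmed hold times of the cycling protocol. The function names and the example stock concentration of 50 ng/µL are hypothetical; the volumes, cycle number, and step times are those stated above. Ramp times between temperatures are not included.

```python
def water_volume_ul(dna_ng_per_ul, dna_ng=100.0, master_mix_ul=12.5,
                    primer_mix_ul=3.5, final_ul=25.0):
    """Return (DNA volume, water volume) in microliters for one mPCR."""
    dna_ul = dna_ng / dna_ng_per_ul
    water_ul = final_ul - master_mix_ul - primer_mix_ul - dna_ul
    if water_ul < 0:
        raise ValueError("DNA stock is too dilute for a 25-uL reaction")
    return dna_ul, water_ul


def program_minutes(hot_start_min=10, cycles=30,
                    step_seconds=(30, 60, 60), final_ext_min=10):
    """Total programmed hold time in minutes (ramping excluded)."""
    return hot_start_min + cycles * sum(step_seconds) / 60 + final_ext_min


if __name__ == "__main__":
    dna_ul, water_ul = water_volume_ul(50.0)  # hypothetical 50 ng/uL stock
    print(f"DNA: {dna_ul} uL, water: {water_ul} uL")
    print(f"Programmed hold time: {program_minutes()} min")
```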

Multiplex SNaPshot reactions
Before the minisequencing reaction, the mPCR products were treated with exonuclease I and shrimp alkaline phosphatase to remove single-stranded and non-specific sequences, excess primers, and unincorporated dNTPs. PCR product (3 µL) was combined with 1 µL of exonuclease I/shrimp alkaline phosphatase enzyme mixture (Affymetrix, Product no: 78200/01/02/05/50, USA) and placed in a thermocycler. After 15 min at 37 °C, the reaction mixture was incubated at 80 °C for 15 min to inactivate the enzymes and was then kept at 4 °C until later use. The SNaPshot reaction extends each minisequencing primer by a single fluorescent ddNTP, generating products that are specific to each SNP allele. The minisequencing primers were first set up individually on 10 independent control DNA samples from whole blood. The reactions were optimized to define the best primer concentrations and to obtain electropherograms without background noise and with ideal peak resolution and homogeneous peak heights. Furthermore, the optimal amount of template genomic DNA was defined experimentally by testing different concentrations of DNA template for each Multiplex SNaPshot Mix. The test conditions were evaluated using DNAs carrying the known mutations. All 30 participants included in the test were analyzed for heterozygosity and homozygosity, and the electropherograms were re-evaluated for peak height, size, and resolution. Finally, the SNaPshot minisequencing reaction was performed in a 5-μL final volume using 1 μL of the treated PCR product, 2 μL of the minisequencing primer cocktail (final primer concentrations ranging from 0.01 to 0.4 μM; Table 3), and 1 μL of SNaPshot® Multiplex Ready Reaction Mix (Applied Biosystems, Poland). The primer extension conditions consisted of 96 °C for 10 s, followed by 25 cycles of 10 s at 96 °C, 5 s at 50 °C, and 30 s at 60 °C; the products were then kept at 4 °C until further use.
Afterwards, the samples were treated with calf intestinal alkaline phosphatase (New England BioLabs, Whitby, Ontario, Canada) at 37 °C for 45 min, followed by 15 min at 75 °C for enzyme inactivation. The Multiplex SNaPshot reaction products (1 µL) were mixed with 8.8 µL of HiDi™ formamide (Thermo Fisher Scientific) and 0.2 µL of GeneScan 120 LIZ size standard (Thermo Fisher Scientific). Capillary electrophoresis was undertaken on an ABI PRISM 3130XL Genetic Analyzer (Thermo Fisher Scientific) using POP-7 polymer. Multiplex extension products were visualized and analyzed automatically with GeneMapper™ Software (Thermo Fisher Scientific).

Validation of the multiplex minisequencing assay
Sanger sequencing was performed to confirm the multiplex minisequencing results (Supplementary Figs. 1-20).

RESULTS
Multiplex PCR product analysis
The PCR products were analyzed by agarose gel electrophoresis using a 5-μL aliquot of the total reaction. The amplified products were run on a 1.5% agarose gel in 0.5× Tris-Borate-EDTA buffer at 90 V. The sizes of the products are listed in Table 2, and an example of PAH PCR-amplified products is shown in Figure 1. Reaction products were stored at -20 °C until use.

Minisequencing data analysis
To find the point mutations in the PAH gene, we performed multiplex minisequencing on the amplified PCR products with a mixture of four terminator nucleotides (ddNTPs), each labeled with a different fluorescent dye. Each primer molecule was extended by one of the four dye terminators, so that the extended product carried the corresponding fluorescent tag. In minisequencing, a cycle sequencing reaction was carried out in the presence of Taq DNA polymerase: each mutation-detection primer was annealed so that its 3' end lay immediately before the mutation site, and the extended primer served as a reporter of the wild-type and/or mutant genotype of the template DNA. In homozygous wild-type or homozygous mutant samples, only the wild-type or the mutant dye terminator, respectively, was incorporated into the primer. As a result, only one peak was shown for that primer on the electropherogram. In a heterozygous sample, however, both the wild-type and the mutant dye terminators were incorporated into the mutation-detection primer, resulting in two distinct fluorescence signals or peaks. In addition, wild-type and mutant allele peak heights may differ significantly because of differences in the fluorescence emission of the fluorophores [28] (Figs. 2 and 3). Data were analyzed using the GeneMapper software. The size of each SNP minisequencing fragment was determined from its mobility through the POP-7 polymer relative to the GeneScan 120 LIZ size standard (15-120 nucleotide fragments) [25]. The relative sizes and signal colors for each allele (major or minor) and SNP are shown in Table 4. Length differences of 4, 8, and 9 nucleotides between neighboring minisequencing primers provided sufficient separation in the analyzed data.

Validation of the multiplex minisequencing assay
To estimate the precision of the assay, we screened the same cohort of patients that had been sequenced for all 10 SNPs. The proportion of correctly recognized SNPs (true positives) and the presence of any false-positive or false-negative peaks were assessed. Our minisequencing results showed 100% consistency with the genotypes defined by sequencing. No false-negative or false-positive results were detected. These data showed that the newly developed multiplex minisequencing technique was highly accurate and appropriate for detecting the 10 polymorphisms.
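The comparison against sequencing can be illustrated with a short sketch that computes the concordance rate between two sets of genotype calls. It assumes that calls are stored as dictionaries keyed by (sample, SNP); the sample identifiers are hypothetical, and the rs5030855 genotypes are the three states reported in this study. Allele order is ignored, so G/A and A/G count as the same call.

```python
def normalize(genotype: str) -> str:
    """Return the genotype with alleles in sorted order, e.g. G/A -> A/G."""
    return "/".join(sorted(genotype.upper().split("/")))


def compare_calls(assay, reference):
    """Compare two {(sample, snp): genotype} dicts on their shared keys.

    Returns (concordance rate, list of discordant keys).
    """
    shared = sorted(set(assay) & set(reference))
    discordant = [k for k in shared
                  if normalize(assay[k]) != normalize(reference[k])]
    rate = 1 - len(discordant) / len(shared) if shared else float("nan")
    return rate, discordant


if __name__ == "__main__":
    # Hypothetical sample IDs; genotypes are those described for rs5030855
    snapshot = {("S1", "rs5030855"): "G/G",
                ("S2", "rs5030855"): "A/G",
                ("S3", "rs5030855"): "A/A"}
    sanger = {("S1", "rs5030855"): "G/G",
              ("S2", "rs5030855"): "G/A",
              ("S3", "rs5030855"): "A/A"}
    rate, discordant = compare_calls(snapshot, sanger)
    print(f"concordance: {rate:.0%}, discordant calls: {discordant}")
```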

DISCUSSION
Identification of variants in the PAH gene is necessary to verify the diagnosis, select the treatment strategy, and detect heterozygous carriers. In this regard, various studies have been conducted to identify the distribution of PAH gene mutations in different regions of Iran. In 2018, Esfahani et al. [21] conducted a relatively complete study on the mutation spectrum of the PAH gene in the Iranian population. In that study, 34 different mutations were recognized with a 100% mutation detection rate. IVS10-11G>A, p.P281L, p.R261Q, p.F39del, and IVS11+1G>C were the most prevalent mutations, with frequencies of 26.07%, 19.3%, 12.86%, 6.07%, and 3.93%, respectively. All other mutations showed a relative frequency of less than 3.5%. That study was conducted on 140 Iranian patients with classic PKU. All important regions, including 13 exons as well as exon-intron

[Figure caption fragment, beginning missing] SNPs (detected genotypes), from left to right: rs5030849, rs62507288, rs5030845, rs62507341, rs5030843, rs62508637, rs5030855, rs5030857, rs199475575, and rs62516095; (B) electropherogram showing the multiplex detection of PAH SNPs in a homozygous patient. Although the mutant homozygous and normal minisequencing products have the same size, the mutant peak appears later than the normal peak because the dye affects the mobility of the DNA fragments; (C) electropherogram showing the multiplex detection of PAH SNPs in a heterozygous individual.
Detection of PKU by newborn screening and treatment from birth is a significant achievement in public health. The main method for detecting PAH gene mutations is DNA sequencing [29,30], but Valian et al. [31] used the PCR-RFLP method, and Bagheri et al. [32] utilized the PAH variable number of tandem repeats method. A combination of single-strand conformation polymorphism and DNA sequencing methods has also been employed to identify PAH mutations [2]. These methods need to be confirmed with a precise technique such as DNA sequencing. Methods allowing simple, low-cost, fast, and high-throughput detection of mutations are becoming of particular interest in the molecular diagnosis of genetic diseases. In one study, the minisequencing method was compared with sequencing and real-time PCR techniques [33]. The three techniques (next-generation sequencing, SNaPshot, and real-time PCR) gave the same results when the amount of DNA was sufficient. However, next-generation sequencing had a high cost [33]. The SNaPshot technique has also been used to analyze human SNPs in criminological studies.
In this field, the SNaPshot technique has been described as a cheap, versatile, and effective method [34,35]. The SNaPshot methodology was applied in 2014 to identify hepatitis B viral mutations, and the results showed that it is a reliable, affordable, and easy-to-use method for identifying genotypes A-D of this virus [36]. Because of its high specificity in distinguishing sequence variants, the technique can be used in both research and routine diagnostic laboratories, especially in criminology laboratories [35,37,38]. Using sequencing as the reference, our study confirmed the accuracy of this technique and demonstrated its capability in detecting mutations. Beyond its lower cost, its advantages were high accuracy and rapid diagnosis. Thus, it can serve as a functional technique in clinical laboratories to identify either PKU mutations or mutations in disorders such as thalassemia and mitochondrial diseases [39].
Based on the location and fluorescent color of the peaks on the electropherograms produced, we were able to detect the genotypes of the samples using the SNaPshot minisequencing approach. Based on the color and quantity of peaks at each mutation site in the electropherogram, minisequencing also successfully distinguished the heterozygosity from the homozygosity within the same reaction. The proper design of the multiplex minisequencing primers is essential for performance of the minisequencing assay.
In the present study, we described a simple multiplex technique for the simultaneous detection of 10 SNPs located in the PAH gene. The designed assays were based on the SNaPshot minisequencing method, which has been found to be a precise assay for SNP genotyping in numerous fields of biology, including forensic and population genetics [14]. The GeneMapper 1.6 software was employed to observe and analyze the peaks. Using the GeneScan 120 LIZ size standard (15-120 nucleotide fragments) as a reference, each SNP minisequencing fragment was assigned a size according to how quickly it moved through the POP-7 polymer. A color was assigned to each dye-labeled ddNTP as follows: green/A, black/C, blue/G, and red/T. The minisequencing reaction produced one peak (homozygote) or two peaks (heterozygote), depending on the SNP genotype. Homozygotes had only one peak, either for the major allele or for the minor allele [25]. The relative sizes and signal colors for each allele (major or minor) of each SNP are shown in Table 4.
Iran. Biomed. J. 27 (1): 46-57
Although the minisequencing reaction products for a specific SNP site had the same size, the different electrophoretic mobilities of the incorporated dye-labeled ddNTPs allowed visualization of two separate peaks rather than two superimposed peaks of different colors [25]. The analyzed data showed sufficient separation with length differences of 4, 8, and 9 nucleotides between adjacent minisequencing primers. The stated size for each SNP differs from the actual size (the minisequencing primer size) by a few bases, since the dye affects the mobility of the DNA fragments in the POP-7 polymer. Examples of multiplex minisequencing electropherograms with the 10 detected PAH SNPs for normal homozygous, mutant homozygous, and heterozygous individuals at rs5030855 are shown in Figures 2 and 3. As shown in the two figures, the height and color of the fluorescence peaks differ among the 10 SNPs. In the normal person, according to the color of the displayed peaks, the genotypes from left to right are rs5030849: C/C, rs62507288: G/G, rs5030845: A/A, rs62507341: T/T, rs5030843: G/G, rs62508637: G/G, rs5030855: G/G, rs5030857: C/C, rs199475575: G/G, and rs62516095: G/G. In the mutant homozygous state for rs5030855, the genotype of the SNP is A/A, and in the heterozygous state for rs5030855, the genotype is G/A. For the normal person, only one peak was observed at each site, and the genotypes of all SNPs were identified in a single reaction. However, two peaks were found at the SNP location for a person who was heterozygous for one SNP (Fig. 2C). Sometimes, depending on the type of mutation and the size of the primer, these two peaks overlap, and the peak height is lower than that of the homozygous state.
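The peak-reading logic described above (one color at a site means a homozygote, two colors mean a heterozygote) can be sketched as follows. This is an illustration only, not the GeneMapper algorithm: the size windows, peak sizes, and peak heights are hypothetical placeholders (the actual sizes are given in Table 4), and the dye-to-base mapping is the one stated in the text. For primers designed on the reverse strand, the reported base would be the complement of the forward-strand allele, which this sketch does not handle.

```python
# Dye-to-base assignment stated in the text
DYE_TO_BASE = {"green": "A", "black": "C", "blue": "G", "red": "T"}


def call_genotypes(peaks, bins, min_height=0.0):
    """Call one genotype per SNP from electropherogram peaks.

    peaks: list of (size_nt, color, height) tuples
    bins:  {snp_id: (low_nt, high_nt)} size window per SNP (hypothetical)
    """
    calls = {}
    for snp, (low, high) in bins.items():
        bases = sorted({DYE_TO_BASE[color]
                        for size, color, height in peaks
                        if low <= size <= high and height >= min_height})
        if len(bases) == 1:
            calls[snp] = f"{bases[0]}/{bases[0]}"  # homozygote: one color
        elif len(bases) == 2:
            calls[snp] = "/".join(bases)           # heterozygote: two colors
        else:
            calls[snp] = "no call"                 # none, or more than two
    return calls


if __name__ == "__main__":
    # Hypothetical peaks: (size in nt, dye color, height)
    peaks = [(41.8, "blue", 1200), (42.6, "green", 900), (25.1, "black", 1500)]
    bins = {"rs5030855": (40, 44), "rs5030857": (23, 27)}
    print(call_genotypes(peaks, bins))
```

Because the two allele peaks at one site migrate slightly differently, each window is given a tolerance of a few nucleotides rather than a single expected size.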
As a method of polymorphism screening, the main advantage of the minisequencing assay is the simultaneous detection of many selected polymorphisms in a single reaction, with the results displayed in a single electropherogram. Any laboratory equipped with a thermocycler and a DNA sequencer can perform the assay with ease. Notably, this test would be useful in diagnostic laboratories with moderate to high patient sample volumes, using automated electrophoresis and subsequent data processing with the Genetic Analyzer device and Genotyper software [14].
Using this technique for the simultaneous detection of several polymorphisms at distant genetic locations in a single reaction makes it cost- and time-efficient compared with other commonly used genotyping assays such as PCR-RFLP and Sanger DNA sequencing. Although PCR-RFLP is a common approach, genotyping the high number of SNPs selected for the two loci requires many PCR reactions, which is time-consuming and impractical. More crucially, because some SNPs under research have no useful restriction enzyme recognition sites or have additional SNPs inside the recognition site, PCR-RFLP is not applicable